 graph analysis



Relation Extraction Across Entire Books to Reconstruct Community Networks: The AffilKG Datasets

arXiv.org Artificial Intelligence

When knowledge graphs (KGs) are automatically extracted from text, are they accurate enough for downstream analysis? Unfortunately, current annotated datasets cannot be used to evaluate this question, since their KGs are highly disconnected, too small, or overly complex. To address this gap, we introduce AffilKG (https://doi.org/10.5281/zenodo.15427977), which is a collection of six datasets that are the first to pair complete book scans with large, labeled knowledge graphs. Each dataset features affiliation graphs, which are simple KGs that capture Member relationships between Person and Organization entities -- useful in studies of migration, community interactions, and other social phenomena. In addition, three datasets include expanded KGs with a wider variety of relation types. Our preliminary experiments demonstrate significant variability in model performance across datasets, underscoring AffilKG's ability to enable two critical advances: (1) benchmarking how extraction errors propagate to graph-level analyses (e.g., community structure), and (2) validating KG extraction methods for real-world social science research.
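As a rough illustration of how an edge-level extraction error can shift a graph-level statistic, the sketch below scores a set of extracted Member edges against a gold set and compares organization sizes. It is not the AffilKG evaluation protocol; the person and organization names, and the helper names `edge_scores` and `org_sizes`, are hypothetical.

```python
from collections import Counter

def edge_scores(gold, predicted):
    """Edge-level precision, recall, and F1 for (person, organization) pairs."""
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)  # edges found in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def org_sizes(edges):
    """Graph-level statistic: number of members per organization."""
    return Counter(org for _, org in edges)

# Hypothetical toy data: one person is attached to the wrong organization.
gold = {("Ana", "Guild"), ("Ben", "Guild"), ("Cy", "Union"), ("Dee", "Union")}
predicted = {("Ana", "Guild"), ("Ben", "Union"), ("Cy", "Union"), ("Dee", "Union")}

print(edge_scores(gold, predicted))
print(org_sizes(gold), org_sizes(predicted))
```

In this toy case a single misattributed edge lowers precision and recall to 0.75 each, and it also changes both organizations' sizes, which is the kind of propagation the abstract refers to.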


Co-Activation Graph Analysis of Safety-Verified and Explainable Deep Reinforcement Learning Policies

arXiv.org Artificial Intelligence

Deep reinforcement learning (RL) policies can demonstrate unsafe behaviors and are challenging to interpret. To address these challenges, we combine RL policy model checking--a technique for determining whether RL policies exhibit unsafe behaviors--with co-activation graph analysis--a method that maps neural network inner workings by analyzing neuron activation patterns--to gain insight into the safe RL policy's sequential decision-making. This combination lets us interpret the RL policy's inner workings for safe decision-making. We demonstrate its applicability in various experiments.


VisGraphVar: A Benchmark Generator for Assessing Variability in Graph Analysis Using Large Vision-Language Models

arXiv.org Artificial Intelligence

The fast advancement of Large Vision-Language Models (LVLMs) has shown immense potential. These models are increasingly capable of tackling abstract visual tasks. Geometric structures, particularly graphs with their inherent flexibility and complexity, serve as an excellent benchmark for evaluating these models' predictive capabilities. While human observers can readily identify subtle visual details and perform accurate analyses, our investigation reveals that state-of-the-art LVLMs exhibit consistent limitations in specific visual graph scenarios, especially when confronted with stylistic variations. In response to these challenges, we introduce VisGraphVar (Visual Graph Variability), a customizable benchmark generator able to produce graph images for seven distinct task categories (detection, classification, segmentation, pattern recognition, link prediction, reasoning, matching), designed to systematically evaluate the strengths and limitations of individual LVLMs. We use VisGraphVar to produce 990 graph images and evaluate six LVLMs, employing two distinct prompting strategies, namely zero-shot and chain-of-thought. The findings demonstrate that variations in visual attributes of images (e.g., node labeling and layout) and the deliberate inclusion of visual imperfections, such as overlapping nodes, significantly affect model performance. This research emphasizes the importance of a comprehensive evaluation across graph-related tasks, extending beyond reasoning alone. VisGraphVar offers valuable insights to guide the development of more reliable and robust systems capable of performing advanced visual graph analysis.


Graph Analysis Using a GPU-based Parallel Algorithm: Quantum Clustering

arXiv.org Artificial Intelligence

Graph clustering, also known as network clustering, is a technique for partitioning a graph into clusters or communities of nodes based on their structural properties [1]. Graph clustering is used in various applications such as social network analysis, image segmentation, and bioinformatics. The goal of graph clustering is to group the nodes in a way that maximizes the similarity within each group and minimizes the similarity between groups [2]. The resulting clusterings are usually evaluated using metrics such as modularity, Normalized Mutual Information (NMI), Adjusted Rand Index (ARI), and the Fowlkes-Mallows Index (FMI).
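Two of the metrics named above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not code from the paper: it assumes an undirected, unweighted graph without self-loops, and the function names and toy graph are hypothetical. Modularity scores a partition against the graph's own structure, while ARI compares a partition with a reference labeling.

```python
from collections import Counter
from math import comb

def modularity(edges, community):
    """Newman modularity: sum over communities of L_c/m - (d_c/(2m))^2."""
    m = len(edges)
    intra = Counter()   # edges with both endpoints in the same community
    degree = Counter()  # total node degree per community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting ARI between two labelings of the same nodes."""
    n = len(labels_true)
    if n < 2:
        return 1.0
    cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = rows * cols / comb(n, 2)
    max_index = (rows + cols) / 2
    if max_index == expected:  # both labelings are trivially identical
        return 1.0
    return (cells - expected) / (max_index - expected)

# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, community))
print(adjusted_rand_index([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1]))
```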


Segment Anything in Non-Euclidean Domains: Challenges and Opportunities

arXiv.org Artificial Intelligence

The recent work known as Segment Anything (SA) has made significant strides in pushing the boundaries of semantic segmentation into the era of foundation models. The impact of SA has sparked extremely active discussions and ushered in an encouraging new wave of developing foundation models for the diverse tasks in the Euclidean domain, such as object detection and image inpainting. Despite the promising advances led by SA, the concept has yet to be extended to the non-Euclidean graph domain. In this paper, we explore a novel Segment Non-Euclidean Anything (SNA) paradigm that strives to develop foundation models that can handle the diverse range of graph data within the non-Euclidean domain, seeking to expand the scope of SA and lay the groundwork for future research in this direction. To achieve this goal, we begin by discussing the recent achievements in foundation models associated with SA. We then shed light on the unique challenges that arise when applying the SA concept to graph analysis, which involves understanding the differences between the Euclidean and non-Euclidean domains from both the data and task perspectives. Motivated by these observations, we present several preliminary solutions to tackle the challenges of SNA and detail their corresponding limitations, along with several potential directions to pave the way for future SNA research. Experiments on five Open Graph Benchmark (OGB) datasets across various tasks, including graph property classification and regression, as well as multi-label prediction, demonstrate that the performance of the naive SNA solutions has considerable room for improvement, pointing towards a promising avenue for future exploration of Graph General Intelligence.


Robots are coming for the lawyers – which may be bad for tomorrow's attorneys but great for anyone in need of cheap legal assistance

#artificialintelligence

Imagine what a lawyer does on a given day: researching cases, drafting briefs, advising clients. While technology has been nibbling around the edges of the legal profession for some time, it's hard to imagine those complex tasks being done by a robot. And it is those complicated, personalized tasks that have led technologists to include lawyers in a broader category of jobs that are considered pretty safe from a future of advanced robotics and artificial intelligence. But, as we discovered in a recent research collaboration to analyze legal briefs using a branch of artificial intelligence known as machine learning, lawyers' jobs are a lot less safe than we thought. It turns out that you don't need to completely automate a job to fundamentally change it.


Using Data To Combat COVID-19

#artificialintelligence

Let's examine the communication breakdown from a more human perspective: Reporting numbers and statistics is only half the battle. The person receiving the information has to decode, interpret, and decide for themselves what this information means to them. This includes exacerbating factors like confirmation bias, and the inaccessibility of credible information (or, put another way, the relative ease with which one can find misinformation). Confirmation bias is a type of cognitive bias where information that is consistent with our existing beliefs is weighted with greater importance and is more likely to be remembered than information that is inconsistent with our existing beliefs. For example, if you prefer the colour red to blue, and there are equally valid studies published on why red is better and why blue is better, you're more likely to believe and remember the study that supports your love of the colour red.


Crunchbase network analysis with Python

#artificialintelligence

This project (code, data, and results) is publicly available on Domino. Crunchbase recently converted its backend database to a Neo4j graph database. This will give it great flexibility in the future, but for now, the data is exposed similarly to how it always has been: individual entities are retrieved and attribute data must be used to form edges between them prior to any graph analysis. Aside from traversing links manually on the web pages, there are no provisions for graph analysis. To enable more powerful manipulations of this data, during my time at Zipfian Academy, I created my "Visibly Connected" project.
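The step of forming edges from attribute data can be sketched as follows. This is a minimal illustration, not the "Visibly Connected" code: the records, the `name` and `investors` field names, and the helper functions are hypothetical and do not reflect Crunchbase's actual schema or API.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical entity records, each retrieved individually with its attributes.
companies = [
    {"name": "AlphaCo", "investors": ["Fund A", "Fund B"]},
    {"name": "BetaCo", "investors": ["Fund B", "Fund C"]},
    {"name": "GammaCo", "investors": ["Fund C"]},
]

def build_edges(records):
    """Turn each (company, investor) attribute pair into a bipartite edge."""
    return [(r["name"], inv) for r in records for inv in r["investors"]]

def co_investment(edges):
    """Project onto companies: weight = number of shared investors."""
    by_investor = defaultdict(set)
    for company, investor in edges:
        by_investor[investor].add(company)
    weights = defaultdict(int)
    for comps in by_investor.values():
        for a, b in combinations(sorted(comps), 2):
            weights[(a, b)] += 1
    return dict(weights)

edges = build_edges(companies)
print(edges)
print(co_investment(edges))
```

Once the edge list exists, it can be handed to any graph library for the analysis itself.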


Nvidia Rapids cuGraph: Making graph analysis ubiquitous | ZDNet

#artificialintelligence

A new open-source library by Nvidia could be the secret ingredient to advancing analytics and making graph databases faster. Nvidia long ago stopped being "just" a hardware company. As its hardware is what much of the compute supporting the explosion in AI runs on, Nvidia has taken upon itself the task of paving the last mile to the software. Nvidia does this by developing and releasing libraries that software developers and data scientists can use to integrate GPU power in their work. The premise is simple: Not everyone is a specialist in parallelism or wants to be one.